Covering Number as a Complexity Measure for POMDP Planning and Learning
Authors
Abstract
Finding a meaningful way to characterize the difficulty of partially observable Markov decision processes (POMDPs) is a core theoretical problem in POMDP research. State-space size is often used as a proxy for POMDP difficulty, but it is a weak metric at best. Existing work has shown that the covering number of the reachable belief space, i.e., the set of belief points reachable from the initial belief point, has interesting theoretical links with the complexity of POMDP planning. In this paper, we present empirical evidence on several small-scale benchmark problems that the covering number of the reachable belief space (or just the "covering number", for brevity) is a far better complexity measure than state-space size for both planning and learning in POMDPs. We connect the covering number to the complexity of learning POMDPs by proposing a provably convergent learning algorithm for POMDPs without reset, given knowledge of the covering number.
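To make the central quantity concrete: a δ-covering number of a belief set is the smallest number of belief points needed so that every point in the set lies within distance δ of some center. The sketch below estimates it for a finite sample of belief points with a greedy cover under the L1 distance; the function name, the greedy strategy, and the choice of metric are illustrative assumptions, not the paper's algorithm (greedy covering only upper-bounds the true covering number).

```python
import numpy as np

def greedy_covering_number(beliefs, delta):
    """Greedy upper-bound estimate of the delta-covering number of a
    finite set of belief points (probability vectors), using L1 distance.
    Illustrative sketch only; not the algorithm from the paper."""
    uncovered = list(beliefs)
    centers = []
    while uncovered:
        # Pick the next uncovered point as a new cover center.
        c = uncovered.pop(0)
        centers.append(c)
        # Discard every point within L1 distance delta of this center.
        uncovered = [b for b in uncovered if np.abs(b - c).sum() > delta]
    return len(centers)

# Example: belief points over a 2-state POMDP, b = [p, 1 - p].
pts = [np.array([p, 1.0 - p]) for p in (0.0, 0.05, 0.5, 0.55, 1.0)]
print(greedy_covering_number(pts, delta=0.2))  # → 3
```

In this example the beliefs near p = 0.0 and p = 0.05 collapse into one cover ball (their L1 distance is 0.1 ≤ 0.2), as do p = 0.5 and p = 0.55, leaving three centers; a small covering number like this signals that the reachable belief space is effectively low-complexity regardless of state-space size.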
Similar papers
Covering Number for Efficient Heuristic-based POMDP Planning
The difficulty of POMDP planning depends on the size of the search space involved. Heuristics are often used to reduce the search space size and improve computational efficiency; however, there are few theoretical bounds on their effectiveness. In this paper, we use the covering number to characterize the size of the search space reachable under heuristics and connect the complexity of POMDP pl...
What makes some POMDP problems easy to approximate?
Point-based algorithms have been surprisingly successful in computing approximately optimal solutions for partially observable Markov decision processes (POMDPs) in high dimensional belief spaces. In this work, we seek to understand the belief-space properties that allow some POMDP problems to be approximated efficiently and thus help to explain the point-based algorithms’ success often observe...
Approximate Planning in Large POMDPs via Reusable Trajectories
We consider the problem of reliably choosing a near-best strategy from a restricted class of strategies in a partially observable Markov decision process (POMDP). We assume we are given the ability to simulate the POMDP, and study what might be called the sample complexity — that is, the amount of data one must generate in the POMDP in order to choose a good strategy. We prove upper bounds on t...
Covering Number: Analyses for Approximate Continuous-state POMDP Planning (Extended Abstract)
To date, many theoretical results on discrete POMDPs have not yet been extended to continuous-state POMDPs, due to the infinite dimensionality of the belief space in a continuous-state case. In this paper, we define a distance in the ℓn-metric space with respect to a partitioning representation of the continuous-state space, and formalize the size of the search space reachable under inadmissible ...